TITLE OF THE INVENTION

VIDEO ENCODING METHOD, VIDEO DECODING METHOD, VIDEO
ENCODING APPARATUS, VIDEO DECODING APPARATUS, VIDEO ENCODING
PROGRAM, AND VIDEO DECODING PROGRAM

BACKGROUND OF THE INVENTION

Field of the Invention 
[0001] 

The present invention relates to a video encoding
method, a video decoding method, a video encoding apparatus,
a video decoding apparatus, a video processing system, a
video encoding program, and a video decoding program.

Related Background Art 
[0002] 

Conventionally, video signal encoding techniques are
used for transmission, storage, and regeneration of video
signals. The well-known techniques include, for example,
the international standard video coding methods such as ITU-T
Recommendation H.263 (hereinafter referred to as "H.263"),
ISO/IEC International Standard 14496-2 (MPEG-4 Visual,
hereinafter referred to as "MPEG-4"), and so on.

[0003] 

Another known newer encoding system is a video coding
method scheduled for joint international standardization
by ITU-T and ISO/IEC: ITU-T Recommendation H.264 and ISO/IEC
International Standard 14496-10 (Joint Final Committee Draft
of Joint Video Specification, hereinafter referred to as
"H.26L"). Concerning the general coding techniques used in
these video coding methods, reference should be made, for
example, to Nonpatent Document 1 presented below.

[0004] 

[Nonpatent Document 1]

Basic Technologies on International Image Coding
Standards

(co-authored by Fumitaka Ono and Hiroshi Watanabe and
published March 20, 1998 by CORONA PUBLISHING CO., LTD.)
[0005]

A motion video signal consists of a series of images
(frames) varying little by little with time. For this reason,
it is common practice in these video coding methods to
implement interframe prediction between a frame retrieved
as a target for encoding (current frame) and another frame
(reference frame) and thereby reduce temporal redundancy
in the video signal.
[0006] 

In this case, where the interframe prediction is carried
out between the current frame and a reference frame with a
smaller difference from the current frame, the redundancy
can be reduced more and encoding efficiency can be increased.
For this reason, the reference frame can be either a temporally
previous frame or a temporally subsequent frame with respect
to the current frame. The prediction with reference to the
previous frame is referred to as forward prediction, while
the prediction with reference to the subsequent frame is
referred to as backward prediction (cf. Fig. 1). Bidirectional
prediction is defined as a prediction in which one of the two
prediction methods is arbitrarily selected, or in which both
methods are used simultaneously.

[0007] 

In general, with use of such bidirectional prediction,
a temporally previous frame as a reference frame for forward
prediction and a temporally subsequent frame as a reference
frame for backward prediction are each stored in a frame
buffer, prior to the current frame.
[0008] 

For example, in decoding of MPEG-4, where the current
frame is decoded by bidirectional interframe prediction,
a temporally previous frame and a temporally subsequent frame
with respect to the current frame are first decoded, prior
to decoding of the current frame, either as frames decoded
by intraframe prediction without use of interframe prediction
or as frames decoded by forward interframe prediction, and
they are stored as reference frames in the frame buffer.
Thereafter, the current frame is decoded by bidirectional
prediction using these two stored frames (cf. Fig. 2(a)).
[0009] 

In this case, therefore, the order of decoding times
of the temporally subsequent reference frame and the current
frame is the reverse of the order of output times of their
respective decoded images. Information indicating its output
time is attached to each of these frames, and thus the
temporal order of the frames can be known from this
information. For this reason, the decoded images are
outputted in the correct order (cf. Fig. 2(b)). In MPEG-4,
the output times are described as absolute values.
[0010] 

Some of the recent video coding methods permit the
foregoing interframe prediction to be carried out using
multiple reference frames, instead of one reference frame
in the forward direction and one reference frame in the
backward direction, so as to enable prediction from a frame
with a smaller change from the current frame (cf. Fig. 3).
[0011]

For example, in decoding of H.26L, a plurality of
reference frames within the range up to the predetermined
maximum number of reference frames are retained in the frame
buffer and an optimal reference frame is arbitrarily
designated out of them on the occasion of implementing
interframe prediction. In this case, where the current frame
is decoded as a bidirectionally predicted frame, reference
frames are first decoded prior to decoding of the current
frame; a plurality of temporally previous frames and a
plurality of temporally subsequent frames with respect to
the current frame are each decoded and retained as reference
frames in the frame buffer. The current frame can be
predicted from a frame arbitrarily designated as the one used
for prediction out of those frames (cf. Fig. 4(a)).
[0012]

In this case, therefore, the order of decoding times
of the temporally subsequent reference frames and the current
frame becomes the reverse of the order of output times thereof.
Information indicating its output time, or information
indicating the output order, is attached to each of these
frames, and the temporal order of the frames can be known
from this information. For this reason, the decoded
images are outputted in the correct order (cf. Fig. 4(b)).
The output times are often described as absolute values.
The output order is used where frame intervals are constant.
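
By way of a non-limiting illustration, the reordering described above
can be sketched in Python as follows. The frame names and output times
are hypothetical values chosen for the example and are not taken from
the figures; the sketch only shows that sorting by the attached output
time restores the display order even though the temporally subsequent
reference frame is decoded before the current frame.

```python
# Frames listed in decoding order as (name, output_time) pairs.
# Names and times are illustrative assumptions, not values from the figures.
DECODING_ORDER = [("previous", 0), ("subsequent", 2), ("current", 1)]


def output_order(decoded_frames):
    """Return frame names sorted by the output time attached to each frame."""
    return [name for name, _ in sorted(decoded_frames, key=lambda f: f[1])]


print(output_order(DECODING_ORDER))
```

Where only the output order (rather than an absolute time) is attached,
the same sort can be applied to the order index instead.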

[0013] 

In the case where the multiple reference frames are
also used in backward prediction, as described above, the
frames retained in the frame buffer are not always used in
backward prediction for frames after the current frame. An
example of this case will be described with reference to
the predictive structure shown in Fig. 5. Let us assume that
the current frame F1 is backward predicted from a temporally
subsequent reference frame F2, F2 from F3, and F3 from F4
and that F4 is forward predicted from a temporally previous
reference frame F0. Such predictions are carried out as an
efficient prediction operation, for example, in the case
where a change is large between the temporally previous
reference frame F0 and the current frame F1, while changes
are small between F1 and the temporally subsequent reference
frames F2, F3, F4, and a change is relatively small between
F0 and F3.
[0014] 

In this case, the current frame F1 is predicted from
only the temporally subsequent reference frame F2, and thus
F3 and F4 are frames that are not used for interframe prediction
at the time of decoding the current frame F1. However, since
F3 and F4 are temporally subsequent to the current
frame F1, they need to be continuously retained until they
are outputted as decoded images at their respective output
times.
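
As a minimal sketch of the predictive structure of Fig. 5, the
following Python fragment derives a decoding order in which every
frame is decoded after the frame it is predicted from. The dictionary
layout and the function name are hypothetical, and F0 is assumed here
to need no reference frame, which the description does not state.

```python
# Each frame maps to the single frame it is predicted from (Fig. 5).
# Assumption: F0 has no reference (None); the text does not say how F0 is coded.
FIG5_REFERENCES = {"F0": None, "F1": "F2", "F2": "F3", "F3": "F4", "F4": "F0"}


def decoding_order(references):
    """Order frames so that each one follows the frame it is predicted from."""
    order = []

    def visit(frame):
        if frame in order:
            return
        reference = references[frame]
        if reference is not None:
            visit(reference)  # the reference frame must be decoded first
        order.append(frame)

    for frame in references:
        visit(frame)
    return order


print(decoding_order(FIG5_REFERENCES))
```

Under these assumptions F4, F3 and F2 are all decoded before F1, so
F3 and F4 are already in the frame buffer when F1 is decoded even
though F1 is predicted from F2 only.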

[0015] 

When the temporally subsequent frames are retained for
the backward prediction in the frame buffer in this way,
such frames are classified into two types, those used as
reference frames and those not used as reference frames in
the interframe prediction after the current frame. In the
description hereinafter, the frames not used as reference
frames but retained in the frame buffer until their output
times arrive will be referred to as "output queueing
frames."
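
The classification can be illustrated by the following Python sketch.
It is an assumption-laden model rather than a description of any
actual decoder: the buffer is represented as a mapping from a frame
name to its output time, the output times are taken equal to the frame
indices of Fig. 5, and the function name `classify` is hypothetical.

```python
def classify(buffer, still_referenced, current_time):
    """Split buffered frames into reference frames and output queueing frames.

    buffer: dict mapping frame name -> output time (hypothetical layout).
    still_referenced: names of frames still used for interframe prediction.
    current_time: output time that has already been reached.
    """
    reference_frames = [name for name in buffer if name in still_referenced]
    output_queueing = [
        name
        for name, output_time in buffer.items()
        if name not in still_referenced and output_time > current_time
    ]
    return reference_frames, output_queueing


# State assumed at the moment F1 of Fig. 5 is decoded: F0 was already output.
BUFFER = {"F0": 0, "F2": 2, "F3": 3, "F4": 4}
print(classify(BUFFER, still_referenced={"F2"}, current_time=0))
```

In this model F2 is the only reference frame, F3 and F4 are output
queueing frames, and F0 falls into neither class because its output
time has passed.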

[0016] 
In order to explain the difference between these frames,
schematic illustrations of a configuration of a video
decoding device are presented in Fig. 6(a) and Fig. 6(b).
As shown in Fig. 6(a), the decoding device 1 is provided
with frame buffer 3 for retaining reference frames, and the
frame buffer 3 outputs a reference frame to decoding processor
2 in execution of interframe prediction. In this case, where
a plurality of reference frames are used in backward
prediction as described above, the frame buffer retains both
the reference frames and output queueing frames. From
a logical aspect, as shown in Fig. 6(b), there exist an area
for storing frames that are continuously retained as reference
frames for a fixed time and also outputted to the decoding
processor 2, and an area for storing frames that are not
outputted to the decoding processor 2 but are continuously
retained until they are outputted as decoded images at their
respective output times.
[0017] 

Incidentally, in the case of the multiple reference
frames being used, for example, if there is a frame having
a peculiar feature in a certain moving picture and having
large changes from the other frames, no effective prediction
can be expected even if that frame is kept retained as a
reference frame. Therefore, the interframe prediction can
be performed more efficiently in certain cases by ceasing
to retain such frames as reference frames and allowing the
frame buffer to retain other frames in their place.
Conversely, in the case where a frame has a typical feature
in a certain moving picture and has small changes from the
other frames, the interframe prediction can be expected to
be carried out efficiently for many frames if such a frame
is retained as a reference frame in the frame buffer for
a long period, regardless of the temporal distance from the
current frame.
[0018] 

In order to realize such selective handling of
reference frames, it is conceivable to signal selection
information for the reference frames in the encoded data. For
example, in H.26L, Memory Management Control Operation (MMCO)
commands are defined. The MMCO commands include, for
example, a Reset command capable of providing
an instruction to eliminate use of all the reference frames
retained in the frame buffer, and other commands, so that
an instruction can be given as needed to choose
any frame to be retained as a reference frame in the frame
buffer.
[0019]

For starting decoding from the middle of encoded data
in order to make random access on a moving picture, necessary
conditions are that the start frame to be decoded is a frame
encoded by intraframe prediction without use of interframe
prediction from another frame, and that frames after the start
frame do not use any frame preceding the decoding-start
frame as a reference frame; i.e., an instruction to eliminate
use of all the reference frames retained in the frame buffer
needs to be given prior to the decoding of the decoding-start
frame.
[0020]

For example, in H.26L, an Instantaneous Decoder Refresh
(IDR) picture is defined in order to clearly specify such
a state. With the IDR picture, use of all the previous
reference frames is eliminated before decoding of the IDR
picture, and interframe predictions for frames thereafter do
not refer to the frames before the IDR picture. This
permits decoding to be started in the middle of encoded
data, as in random access, without facing the problem of
presence/absence of the reference frame when decoding starts
from an IDR picture.

SUMMARY OF THE INVENTION 
[0021] 

In the case where a plurality of reference frames are
used in backward interframe prediction, as described above,
the plurality of reference frames can be efficiently handled
by providing a means for controlling retention of the
reference frames in the frame buffer. On the other hand,
this frame buffer retains both the reference frames used
for interframe prediction and the output queueing frames
retained until their output times arrive, but no
consideration was given to control of such output queueing
frames in the conventional means for controlling the
retention of reference frames. This posed a problem that the
output queueing frames which are not scheduled for output
occupy the frame buffer for a long period of time.
[0022]

For this reason, if the control of output queueing
frames is not properly carried out, there will arise a problem
that output of decoded images cannot be uniquely handled,
or the like.

For example, let us suppose that switching of video
is effected at the time of F03 from a moving picture consisting
of encoded data F00-F04 in Fig. 7(a) to a moving picture
consisting of encoded data F10-F13 in Fig. 7(b). This can
occur, for example, in the case where encoded data stored
for editing of motion video is switched in frame units, or
in the case where, in broadcast type video communication,
certain encoded data corresponding to one channel is switched
to other encoded data corresponding to another channel.
[0023] 

In this case, F10 needs to be an IDR picture, which
eliminates use of all the reference frames retained in the
frame buffer, prior to the decoding of F10, and which permits
the decoding of frames after F10 to be carried out without
any problem. However, the frames which are eliminated from
use in this case are reference frames, and thus unique handling
is not possible if F03 and F04 are not reference frames but
output queueing frames. Namely, it is not clear in the case
of switching to F10 whether F03 and F04 are frames to be
continuously retained in the frame buffer and to be outputted,
or frames to be handled as frames eliminated from use prior
to the decoding of F10. Accordingly, which frame should be
outputted as a decoded image in this case is not uniquely
determined.
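
The ambiguity can be made concrete with the following Python sketch,
which simply enumerates the two possible interpretations. It assumes,
for illustration only, that F00 to F02 have already been outputted when
the switch occurs and that F03 and F04 are output queueing frames; the
function and variable names are hypothetical.

```python
def output_sequence(already_output, queued, new_stream, keep_queued):
    """Build the displayed sequence for one interpretation of the switch.

    keep_queued=True : queued frames stay in the buffer and are outputted.
    keep_queued=False: queued frames are treated as eliminated from use.
    """
    retained = list(queued) if keep_queued else []
    return list(already_output) + retained + list(new_stream)


ALREADY_OUTPUT = ["F00", "F01", "F02"]  # assumed to be displayed before the switch
QUEUED = ["F03", "F04"]                 # output queueing frames at the switch
NEW_STREAM = ["F10", "F11", "F12", "F13"]

print(output_sequence(ALREADY_OUTPUT, QUEUED, NEW_STREAM, keep_queued=True))
print(output_sequence(ALREADY_OUTPUT, QUEUED, NEW_STREAM, keep_queued=False))
```

The two calls yield different displayed sequences from the same
encoded data, which is the non-uniqueness described above.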
[0024] 

An object of the present invention is, therefore, to
clearly define handling of frames retained in the frame buffer
when multiple reference frames are used in backward
interframe prediction.
[0025] 

In order to solve the above problem, a video encoding
method according to the present invention is a video encoding
method for video encoding apparatus to implement backward
interframe prediction from a temporally subsequent frame,
the video encoding method comprising: outputting information
indicating that an option to eliminate use of a decoded image
of the temporally subsequent frame was chosen.

[0026] 

In the video encoding method according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame,
about the decoded image of the temporally subsequent frame.

In the video encoding method according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame
not used as a reference frame for backward prediction, about
the decoded image of the temporally subsequent frame.

In the video encoding method according to the present
invention, preferably, the information includes information
indicating an output time about a decoded image of a frame
use of which is eliminated.
[0027] 

A video decoding method according to the present
invention is a video decoding method for video decoding
apparatus to implement backward interframe prediction from
a temporally subsequent frame, the video decoding method
comprising: in conjunction with input of information for
eliminating use of a decoded image of the temporally
subsequent frame, avoiding outputting the decoded image of
the frame on the basis of the information.

[0028] 

In the video decoding method according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame,
about the decoded image of the temporally subsequent frame.

In the video decoding method according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame
not used as a reference frame for backward prediction, about
the decoded image of the temporally subsequent frame.

In the video decoding method according to the present
invention, preferably, the information includes information
indicating an output time about a decoded image of a frame
use of which is eliminated.

[0029] 

A video encoding apparatus according to the present
invention is a video encoding apparatus for implementing
backward interframe prediction from a temporally subsequent
frame, the video encoding apparatus being configured to:
output information indicating that an option to eliminate
use of a decoded image of the temporally subsequent frame
was chosen.
[0030] 

In the video encoding apparatus according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame,
about the decoded image of the temporally subsequent frame.

In the video encoding apparatus according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame
not used as a reference frame for backward prediction, about
the decoded image of the temporally subsequent frame.

In the video encoding apparatus according to the present
invention, preferably, the information includes information
indicating an output time about a decoded image of a frame
use of which is eliminated.
[0031] 

A video decoding apparatus according to the present
invention is a video decoding apparatus for implementing
backward interframe prediction from a temporally subsequent
frame, the video decoding apparatus being configured to:
in conjunction with input of information for eliminating
use of a decoded image of the temporally subsequent frame,
avoid outputting the decoded image of the frame on the basis
of the information.

[0032] 

In the video decoding apparatus according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame,
about the decoded image of the temporally subsequent frame.

In the video decoding apparatus according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame
not used as a reference frame for backward prediction, about
the decoded image of the temporally subsequent frame.

In the video decoding apparatus according to the present
invention, preferably, the information includes information
indicating an output time about a decoded image of a frame
use of which is eliminated.
[0033]

A video encoding program according to the present
invention is a video encoding program for video encoding
apparatus as a computer for implementing backward interframe
prediction from a temporally subsequent frame, the video
encoding program causing the video encoding apparatus
to implement: a function of outputting information
indicating that an option to eliminate use of a decoded image
of the temporally subsequent frame was chosen.
[0034] 

In the video encoding program according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame,
about the decoded image of the temporally subsequent frame.

In the video encoding program according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame
not used as a reference frame for backward prediction, about
the decoded image of the temporally subsequent frame.

In the video encoding program according to the present
invention, preferably, the information includes information
indicating an output time about a decoded image of a frame
use of which is eliminated.
[0035] 

A video decoding program according to the present
invention is a video decoding program for video decoding
apparatus as a computer for implementing backward interframe
prediction from a temporally subsequent frame, the video
decoding program causing the video decoding apparatus
to implement: a function of, in conjunction with input
of information for eliminating use of a decoded image of
the temporally subsequent frame, avoiding outputting the
decoded image of the frame on the basis of the information.

[0036] 

In the video decoding program according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame,
about the decoded image of the temporally subsequent frame.

In the video decoding program according to the present
invention, preferably, the information is information
providing an instruction to eliminate use of every frame
not used as a reference frame for backward prediction, about
the decoded image of the temporally subsequent frame.

In the video decoding program according to the present
invention, preferably, the information includes information
indicating an output time about a decoded image of a frame
use of which is eliminated.
[0037]

According to these aspects of the invention, the
encoding operation is configured to output the information
providing the instruction to eliminate use of output queueing
frames to be outputted as decoded images, and the decoding
operation is configured to retrieve this information and
output none of the output queueing frames. This prevents
an unnecessary decoded image from being outputted in decoding
of encoded data after editing or in decoding after random
access on encoded data, and conversely also prevents a
necessary decoded image output from being deleted. As a
consequence, an appropriate decoded image output can be
obtained.

[0038] 

A similar effect can also be achieved by constructing
a video processing system comprising the foregoing video
encoding apparatus and the foregoing video decoding apparatus,
wherein the video decoding apparatus decodes data encoded
by the video encoding apparatus.
[0039] 

A video encoding apparatus according to the present
invention is a video encoding apparatus comprising: input
means for effecting input of an image as a target for encoding;
encoding means for encoding the image to generate encoded
data; image storage means for storing an image regenerated
after being encoded by the encoding means; and buffer
management means for managing every image stored in the image
storage means, wherein, on the occasion of encoding an image
(IDR image) encoded without reference to any other image, the
buffer management means outputs, along with the encoded data,
a flag (no_output_of_prior_pics_flag) indicating whether
use is eliminated of every image previously stored in the
image storage means. For example, the video encoding
apparatus outputs the flag of "1" in the case of eliminating
use of every image or outputs the flag of "0" in the other
case, along with the encoded data.

[0040] 

In the video encoding apparatus according to the present
invention, preferably, the encoding means implements
backward interframe prediction from a temporally subsequent
frame, and, on the occasion of encoding the image (IDR image)
encoded without reference to any other image, the buffer
management means deletes a decoded image of every temporally
subsequent frame previously stored in the image storage
means.
[0041] 

A video decoding apparatus according to the present
invention is a video decoding apparatus comprising: input
means for effecting input of image data containing encoded
data of an encoded image, and an image output instruction
flag added to the encoded data; decoding means for decoding
the encoded data to generate a regenerated image; image
storage means for storing the regenerated image; and buffer
management means for managing every regenerated image stored
in the image storage means, wherein the buffer management
means deletes every image stored in the image storage means,
in accordance with the image output instruction flag
corresponding to an image (IDR image) encoded without
reference to any image stored in the image storage means.

[0042] 
In the video decoding apparatus according to the present
invention, preferably, where the image output instruction
flag is "0," use is eliminated of every reference image in
a buffer, and where the flag is "1," every reference image
and every output queueing image in the buffer are deleted.
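
A minimal Python sketch of this buffer management on receipt of an
IDR image is given below. The `Frame` class and the `on_idr` function
are hypothetical names introduced for illustration. The sketch further
assumes that, for flag "0", a reference image whose use is eliminated
is simply dropped from the buffer model while output queueing images
remain; this is one possible reading and not a definitive
implementation.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    is_reference: bool  # True: reference image, False: output queueing image


def on_idr(buffer, image_output_instruction_flag):
    """Return the buffer contents after the flag attached to an IDR image is applied."""
    if image_output_instruction_flag == 1:
        # Flag "1": reference images and output queueing images are all deleted.
        return []
    # Flag "0": only reference use is eliminated (assumed here to drop the
    # reference images); output queueing images stay until their output times.
    return [frame for frame in buffer if not frame.is_reference]


BUFFER = [Frame("F02", True), Frame("F03", False), Frame("F04", False)]
print([f.name for f in on_idr(BUFFER, 0)])
print([f.name for f in on_idr(BUFFER, 1)])
```

With flag "0" the output queueing images F03 and F04 in this example
remain available for output, whereas with flag "1" none of the prior
images is outputted.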

[0043] 

A video encoding method according to the present
invention is a video encoding method comprising: an input
step wherein a video encoding apparatus effects input of
an image as a target for encoding; an encoding step wherein
the video encoding apparatus encodes the image to generate
encoded data; an image storage step wherein the video encoding
apparatus stores an image regenerated after being encoded in
the encoding step, into image storage means; and a buffer
management step wherein the video encoding apparatus manages
every image stored in the image storage means, wherein in
the buffer management step, on the occasion of encoding an
image encoded without reference to any other image, the video
encoding apparatus outputs, along with the encoded data, a
flag indicating whether use is eliminated of every image
previously stored in the image storage means.
[0044]
[0044] 

In the video encoding method according to the present
invention, preferably, the encoding step is configured so
that the video encoding apparatus performs backward
interframe prediction from a temporally subsequent frame,
and the buffer management step is configured so that, on
the occasion of encoding the image encoded without reference
to any other image, the video encoding apparatus deletes
a decoded image of every temporally subsequent frame
previously stored in the image storage means.

[0045] 

A video decoding method according to the present
invention is a video decoding method comprising: an input
step wherein a video decoding apparatus effects input of
image data containing encoded data of an encoded image, and
an image output instruction flag added to the encoded data;
a decoding step wherein the video decoding apparatus decodes
the encoded data to generate a regenerated image; an image
storage step wherein the video decoding apparatus stores
the regenerated image into image storage means; and a buffer
management step wherein the video decoding apparatus manages
every regenerated image stored in the image storage means,
wherein in the buffer management step, the video decoding
apparatus deletes every image stored in the image storage
means, in accordance with the image output instruction flag
corresponding to an image encoded without reference to any
image stored in the image storage means.
[0046] 

In the video decoding method according to the present
invention, preferably, the video decoding apparatus
eliminates use of every reference image in the buffer when
the image output instruction flag is "0," or deletes every
reference image and every output queueing image in the buffer
when the flag is "1."
[0047] 

A video encoding program according to the present
invention is a video encoding program for causing a video
encoding apparatus to implement: a function of effecting
input of an image as a target for encoding; a function of
encoding the image to generate encoded data; a function of
storing an image regenerated after being encoded, into image
storage means; a function of managing every image stored
in the image storage means; and a function of outputting,
along with the encoded data, a flag indicating whether use
is eliminated of every image previously stored in the image
storage means, on the occasion of encoding an image encoded
without reference to any other image.
[0048] 

In the video encoding program, preferably, the video
encoding apparatus is made to further implement a
function of implementing backward interframe prediction from
a temporally subsequent frame, and a function of deleting
a decoded image of every temporally subsequent frame
previously stored in the image storage means, on the occasion
of encoding the image encoded without reference to any other
image.

[0049] 
A video decoding program according to the present
invention is a video decoding program for causing a video
decoding apparatus to implement: a function of effecting
input of image data containing encoded data of an encoded
image, and an image output instruction flag added to the
encoded data; a function of decoding the encoded data to
generate a regenerated image; a function of storing the
regenerated image into image storage means; a function of
managing every regenerated image stored in the image storage
means; and a function of deleting every image stored in the
image storage means, in accordance with the image output
instruction flag corresponding to an image encoded without
reference to any image stored in the image storage means.
[0050] 

In the video decoding program according to the present
invention, preferably, the video decoding apparatus is made
to further implement a function of eliminating use of
every reference image in the buffer when the image output
instruction flag is "0," or deleting every reference image
and every output queueing image in the buffer when the flag
is "1."
[0051] 

The present invention will become more fully understood
from the detailed description given herein below and the
accompanying drawings which are given by way of illustration
only, and thus are not to be considered as limiting the present
invention.

Further scope of applicability of the present invention
will become apparent from the detailed description given
hereinafter. However, it should be understood that the
detailed description and specific examples, while indicating
preferred embodiments of the invention, are given by way
of illustration only, since various changes and modifications
within the spirit and scope of the invention will become
apparent to those skilled in the art from this detailed
description.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig, 1 is a diagram for explaining the forward 
prediction and backward prediction in interf rame prediction . 

Fig. 2(a) is a diagram showing an example in which the
current frame is decoded by bidirectional prediction using
two frames. Fig. 2(b) is a diagram showing an example in
which the order of decoding times of the temporally subsequent
reference frame and the current frame is reverse to the order
of output times of their respective decoded images.

Fig. 3 is a diagram for explaining the forward
prediction and backward prediction with use of multiple
reference frames.

Fig. 4(a) is a diagram showing an example in which
interframe prediction with multiple reference frames within
the range up to the predetermined maximum number of reference
frames is carried out. Fig. 4(b) is a diagram showing an
example in which the order of decoding times of the multiple
temporally subsequent reference frames and the current frame
is reverse to the order of output times of their respective
decoded images.

Fig. 5 is a diagram showing the predictive structure
in the case where the frames retained in the frame buffer
are not used for backward prediction after the current frame.

Fig. 6(a) is a diagram schematically showing the
configuration of the video decoding device. Fig. 6(b) is
a diagram schematically showing the configuration of the
frame buffer.

Fig. 7(a) is a diagram showing an example of a moving
picture before switching consisting of encoded data. Fig.
7(b) is a diagram showing a moving picture after switching
consisting of encoded data.

Fig. 8 is a block diagram showing the functional 
configuration of the video processing system according to 
the present invention. 

Fig. 9 is a diagram showing the configuration of the
video processing program according to the present invention.

Fig. 10 is a block diagram showing the configuration 
of the video encoding device according to the present 
invention. 

Fig. 11 is a flowchart for explaining the process in
which the controller determines
no_output_of_prior_pics_flag.

Fig. 12 is a block diagram showing the configuration
of the video decoding device according to the present
invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 
[0052]

Embodiments of the present invention will be described
below in detail with reference to the drawings.

The description will be given on the presumption that
the encoding and decoding operations are implemented on the
basis of H.26L and that portions without particular
description in the operation of video coding are pursuant
to the operation in H.26L.

[0053] 

First, a configuration of a video processing system
according to the present invention will be described. Fig.
8 is a block diagram showing a functional configuration of
video processing system 100. As shown in Fig. 8, the video
processing system 100 is comprised of video encoding device
10 and video decoding device 20. The video encoding device
10 and video decoding device 20 are configured to perform
the backward interframe prediction from a temporally
subsequent frame.
[0054] 

The detailed processing will be described later, but
the video encoding device 10 is provided with chosen
information output part 11, and the chosen information output
part 11 outputs information indicating that an option to
eliminate use of a decoded image of a temporally subsequent
frame was chosen, to the video decoding device 20.

The video decoding device 20 is provided with decoded
image use eliminating part 21, and the decoded image use
eliminating part 21 is configured so that, in conjunction
with input of information for eliminating use of a decoded
image of a temporally subsequent frame from the video encoding
device 10, the decoded image use eliminating part 21 avoids
outputting the decoded image of the frame on the basis of
the information.
[0055] 

The above information is, for example, information
providing an instruction to eliminate use of every frame,
for each decoded image of a temporally subsequent frame.
As another example, this information is information providing
an instruction to eliminate use of every frame not used as a
reference frame for backward prediction, for each decoded
image of a temporally subsequent frame. Furthermore, for
example, this information includes information indicating
an output time of a decoded image of a frame whose use
is eliminated.
[0056] 

Subsequently, a video encoding method and a video
decoding method according to the present invention will be
described. In the present embodiment, information capable
of providing an instruction to eliminate use of a frame
retained in the frame buffer is defined, and it is defined
separately and explicitly for reference frames and for output
queueing frames.
[0057]

In the present embodiment the information is defined
in the NAL unit syntax, which carries information about the
NAL (Network Abstraction Layer) units, the units for
transmitting encoded data in H.26L. Specifically, it is
defined as a type that is announced by a NAL unit type
included in the NAL unit syntax.
[0058] 

In the present embodiment, it is first assumed that
each conventional IDR picture provides an instruction to
eliminate use of every reference frame, and only the
reference frames, retained in the frame buffer. Namely, the
definition here is that use of any output queueing frame
retained in the frame buffer is not eliminated with an IDR
picture.
[0059] 

Then an Instant Buffer Refresh picture (hereinafter
referred to as an "IBR picture") is defined as a new picture
different from the IDR picture. The IBR picture is defined
as a picture that provides an instruction to eliminate use
of every output queueing frame, in addition to the definition
of the IDR picture providing the instruction to eliminate
use of only every reference frame retained in the frame buffer.
Namely, when the current frame is determined to be an IBR
picture, use of every reference frame and every output
queueing frame retained in the frame buffer is eliminated,
prior to decoding of the current frame.
[0060]

In the encoding, the video encoding device 10 provides
an instruction of such an IBR picture in the case described
below. For example, in the case where an IDR picture would
conventionally be designated as a random access point and
where it is not desired to retain the output queueing frames
prior thereto, regardless of delay by backward prediction,
the pertinent frame can be designated as an IBR picture.
[0061] 

In the decoding, where the current frame is an IBR
picture, the video decoding device 20 eliminates use of every
reference frame and every output queueing frame in the frame
buffer, prior to decoding of the current frame, and treats
them as frames that will be used neither for the interframe
prediction nor for output of their decoded images after the
current frame.

[0062] 

The IBR picture may also be configured to eliminate
use of only a limited set of output queueing frames that are
to go unused, instead of eliminating the use of all the output
queueing frames.

In that case the device may be arranged to determine whether
use of each frame is to be eliminated, using a decoded image
output time attached to the IBR picture, and to eliminate
use of only output queueing frames having their respective
output times greater than the attached output time.
[0063]

As another example, the system may also be configured
to convey an output time separately, eliminate use of only
output queueing frames having their output times greater
than the conveyed output time, and designate each unnecessary
output queueing frame, based on a decoded image output time
correlated with that frame.
[0064] 

In this case, there arises a need for a syntax to convey
the output time, and it may be configured, for example, as
a syntax that exists only with the NAL unit type indicating
the IBR picture in the NAL unit syntax and that conveys the
time along with the IBR picture. For example, we can define
"latest_output_time" which is conveyed subsequent to the
NAL unit type where the NAL unit type indicates the IBR picture.
Here the latest_output_time is assumed to use the same time
unit used in the other syntaxes indicating the time in H.26L
and to indicate a maximum delay time in the time unit of
90 kHz. It is also assumed that the numeral expressed in this
time unit is encoded by a 32-bit unsigned fixed-length code
and that the encoded data is then transmitted.
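
The fixed-length coding of latest_output_time described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the big-endian byte order and the conversion from seconds are assumptions made for illustration; the text specifies only a 32-bit unsigned fixed-length code expressed in the 90 kHz time unit.

```python
import struct

CLOCK_HZ = 90_000          # 90 kHz time unit stated in the text
MAX_U32 = 0xFFFFFFFF       # largest value a 32-bit unsigned code can hold


def encode_latest_output_time(seconds: float) -> bytes:
    """Convert a time in seconds to 90 kHz units and pack it as 32 bits.

    Big-endian byte order is an assumption, not taken from the text.
    """
    ticks = round(seconds * CLOCK_HZ)
    if not 0 <= ticks <= MAX_U32:
        raise ValueError("value does not fit a 32-bit unsigned code")
    return struct.pack(">I", ticks)


def decode_latest_output_time(payload: bytes) -> int:
    """Read the 32-bit unsigned value back, in 90 kHz units."""
    (ticks,) = struct.unpack(">I", payload[:4])
    return ticks
```

With a 32-bit code at 90 kHz the largest representable value is a little over 13 hours, which bounds the output time that such a field could express.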

[0065] 
In the decoding, when receiving the
latest_output_time, the video decoding device 20 eliminates
use of only frames whose decoded image output times are greater
than the latest_output_time, among the output queueing frames
retained in the frame buffer.
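
As a rough illustration of this selection rule, the following Python sketch splits an output queue into frames that remain queued and frames whose use is eliminated. Representing the queue as (frame id, output time) pairs and the function name are assumptions for illustration only.

```python
def split_by_latest_output_time(output_queue, latest_output_time):
    """Split output queueing frames by their decoded image output time.

    output_queue is a list of (frame_id, output_time) pairs, a
    hypothetical representation. Frames whose output time is greater
    than latest_output_time have their use eliminated; the rest stay.
    """
    kept = [(fid, t) for fid, t in output_queue if t <= latest_output_time]
    eliminated = [(fid, t) for fid, t in output_queue if t > latest_output_time]
    return kept, eliminated
```

A frame whose output time equals latest_output_time is kept here, because the text eliminates only frames with output times greater than the conveyed value.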

[0066] 

In the case where the latest_output_time is used in
this way, the system may also be configured so that a flag
is further conveyed before it and the syntax with the
latest_output_time is used only when there is an indication
by the flag. In this case, it is possible to define an option
to eliminate use of all the output queueing frames when the
latest_output_time is omitted.
[0067] 

In the present embodiment the IBR picture is defined
as a picture different from the IDR picture, but it is also
possible to define a flag attached to the IDR picture and
indicating how to handle the output queueing frames
(no_output_of_prior_pics_flag), instead of defining the new
picture like the IBR picture, and provide the IDR picture
with the same definition as the IBR picture, because the
difference between the IDR picture and the IBR picture is
the difference in how to handle the output queueing frames
retained in the frame buffer.
[0068]

Namely, just like the aforementioned definition, the
IDR picture is defined as a picture providing an instruction
to eliminate use of only all the reference frames retained
in the frame buffer but not to eliminate use of the output
queueing frames. In this case, a flag attached to the IDR
picture is separately conveyed for the IDR picture, and with
an indication by the flag (no_output_of_prior_pics_flag =
1), the IDR picture, similar to the aforementioned IBR
picture, is handled as one providing an instruction to
eliminate use of all the output queueing frames, in addition
to all the reference frames retained in the frame buffer.

This announcement makes it feasible to provide the
instruction to eliminate use of the output queueing frames
as well, without defining any new picture.
[0069] 

The present embodiment will be further described with
reference to Figs. 10-12.

Fig. 10 is a block diagram showing the configuration
of video encoding device 10 in the present embodiment. The
conditions for video coding are entered through input
terminal 47. In general, the encoding conditions are
selected or entered through a keyboard. The encoding
conditions are, for example, the size of the image as a target
for encoding, the frame rate, and the bit rate. Further
encoding conditions include a predictive reference structure
of video and a capacity of buffer 44.

[0070] 
The predictive reference structure of video is, for
example, timing of encoding an input image as an IDR image,
an image referred to by a predictively encoded image, or
the like. The capacity of buffer 44 is, for example, the
number of images temporarily stored as reference images.
These conditions may be set so as to vary with time. The
encoding conditions thus entered are stored in controller
48.

[0071] 

With a start of an encoding operation, the controller
48 outputs the encoding conditions to encoder 43, where the
encoding conditions are set. On the other hand, an image as
a target for encoding is entered through input terminal 41,
is then supplied via frame memory 42 to encoder 43, and
thereafter is encoded. Since the order of images is switched
on the occasion of backward prediction, the input image is
temporarily stored in the frame memory 42.
[0072] 

The encoder 43 encodes the image on the basis of the
algorithm of H.26L. The encoded image is sent to multiplexer
45 to be multiplexed with other related information and
thereafter the multiplexed data is outputted through output
terminal 46. The image used for prediction is reproduced
by encoder 43 and then is stored as a reference image for
encoding of the next image, into the buffer 44.

[0073] 
Fig. 11 is a flowchart for explaining a process in which
the controller 48 determines no_output_of_prior_pics_flag.
This process is executed for each of the images constituting
a video as a target for encoding. The first step S1 is to
input the encoding conditions for an image as a
target for encoding (whether the image is to be encoded as
an IDR image) and information about whether a
regenerated image previously stored in the buffer 44 is used
as a reference image or as a displayed image. Such
information is separately managed by the controller 48.

[0074] 

S2 is to determine whether the image as a target for
encoding is an IDR image. When the result of the
determination is that the image is not an IDR image (S2;
NO), this process is ended (S3). When the image is an IDR
image (S2; YES), the flow moves to S4. S4 is to determine
whether the images previously stored in the buffer 44 are
used as reference images or as displayed images.
[0075] 

When the result of the above determination is that the
images previously stored in the buffer 44 are used as
reference images or as displayed images (S4; YES),
no_output_of_prior_pics_flag is set to "0" (S5). On the
other hand, when they are used neither as reference images
nor as displayed images (S4; NO),
no_output_of_prior_pics_flag is set to "1" (S6). The flag
no_output_of_prior_pics_flag thus set is supplied through
bus L8 to the multiplexer 45, and thereafter is added to
encoded data of the corresponding image, and the data is
sent out through output terminal 46 (S7). When at S6
no_output_of_prior_pics_flag is set to "1," the controller
48 outputs a command to delete all the images stored in the
buffer 44, through L7.
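
The decision flow of steps S2 through S6 can be sketched in Python as follows. The function names, the boolean inputs, and the use of a plain list to stand in for the buffer 44 are assumptions for illustration; the multiplexing of step S7 is not modeled.

```python
from typing import List, Optional


def determine_no_output_of_prior_pics_flag(
    is_idr: bool, prior_images_in_use: bool
) -> Optional[int]:
    """Return the flag value, or None when the image is not an IDR image.

    prior_images_in_use is True when images already stored in the buffer
    are still used as reference images or as displayed images.
    """
    if not is_idr:               # S2: NO, so the process ends (S3)
        return None
    if prior_images_in_use:      # S4: YES
        return 0                 # S5
    return 1                     # S4: NO, so S6


def control_step(is_idr: bool, prior_images_in_use: bool, buffer_44: List) -> Optional[int]:
    """Determine the flag and delete the stored images when it is 1."""
    flag = determine_no_output_of_prior_pics_flag(is_idr, prior_images_in_use)
    if flag == 1:
        buffer_44.clear()        # command to delete all images in the buffer
    return flag
```

Returning None for a non-IDR image reflects that the flag is attached only to IDR images in this embodiment.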
[0076] 

Fig. 12 is a block diagram showing the configuration
of the video decoding device 20 in the present embodiment.
Data to be decoded is entered through input terminal 51.
This data is the image data of each frame encoded by the
video encoding device 10. When this data is an IDR image,
no_output_of_prior_pics_flag is multiplexed therewith.
The input data is stored in input buffer 52. At a
predetermined time, data of one frame is fed from input buffer
52 to decoder 53 in response to an instruction from controller
56 and decoding thereof is initiated according to the
algorithm of H.26L.
[0077]

A decoded image is first stored in output buffer 54.
The image stored in the output buffer 54 is fed back via
bus M5 to decoder 53 and thereafter is used as a reference
image for decoding of the next image. On the other hand,
the data of no_output_of_prior_pics_flag added to the IDR
image decoded by decoder 53 is fed to controller 56.






[0078] 

The controller 56 refers to the value of
no_output_of_prior_pics_flag ("0" or "1") and controls the
output buffer 54, based thereon. Specifically, when
no_output_of_prior_pics_flag is "1," the reference images
and display queueing images both become unnecessary and thus
the controller 56 outputs a command to delete all the images
stored in the output buffer 54, via bus MB. When
no_output_of_prior_pics_flag is "0," on the other hand, the
image corresponding thereto is still an IDR image, and so the
controller 56 eliminates use of all the images used for
reference (reference images) and deletes them from the output
buffer 54. However, the display queueing images are not
deleted. Each display queueing image is displayed at its time
on a display device.
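
The control of the output buffer 54 by the controller 56 can be sketched in Python as follows. The dictionary layout of a stored image and the function name are hypothetical. How an image that is both a reference image and a display queueing image is treated when the flag is "0" is not spelled out in the text; this sketch assumes it stays in the buffer for display but is no longer used for reference.

```python
def apply_no_output_of_prior_pics_flag(output_buffer, flag):
    """Return the contents of the output buffer after an IDR image arrives.

    output_buffer is a list of dicts with the hypothetical keys
    'id', 'is_reference' and 'awaiting_display'.
    """
    if flag == 1:
        # Reference images and display queueing images are all deleted.
        return []
    if flag == 0:
        # Reference use is eliminated; images still waiting for display
        # are kept (assumption: kept even if they were also references).
        return [
            {**img, "is_reference": False}
            for img in output_buffer
            if img["awaiting_display"]
        ]
    raise ValueError("flag must be 0 or 1")
```

The function returns a new list rather than mutating its argument, which keeps the before and after states easy to compare when experimenting with the rule.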

[0079] 

The present embodiment described the example in which
the present invention was implemented on the basis of
H.26L, but the video encoding methods to which the present
invention can be applied are not limited to H.26L but include
a variety of video encoding methods using the backward
interframe prediction.
[0080] 

Furthermore, the present embodiment was configured so
that the syntax for conveying no use of output queueing frames
was defined as an additional definition of the NAL unit type
announced in the NAL unit syntax and so that the syntax by
the fixed-length codes was added for conveying the output
time, but it is a matter of course that the definitions and
syntaxes for conveying thereof are not limited to these.
Variable-length codes may also be used instead of the
fixed-length codes for the announcement of output time, and
the unit of output time can be any other unit than 90 kHz;
for example, the unit of output time may be based on the
time unit defined in num_units_in_tick and time_scale
subsidiarily announced as Video Usability Information (VUI)
in H.26L.
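
As an illustration of the alternative time unit, the Python sketch below converts a value expressed in 90 kHz units into ticks derived from num_units_in_tick and time_scale. It assumes the usual VUI interpretation in which one tick lasts num_units_in_tick / time_scale seconds; the function names are hypothetical.

```python
from fractions import Fraction


def tick_duration_seconds(num_units_in_tick: int, time_scale: int) -> Fraction:
    """Duration of one tick in seconds, kept exact as a fraction."""
    return Fraction(num_units_in_tick, time_scale)


def ticks_from_90khz(value_90khz: int, num_units_in_tick: int, time_scale: int) -> Fraction:
    """Express a 90 kHz timestamp as a number of VUI ticks."""
    seconds = Fraction(value_90khz, 90_000)
    return seconds / tick_duration_seconds(num_units_in_tick, time_scale)
```

Exact fractions are used so that common settings such as num_units_in_tick = 1001 with time_scale = 60000 convert without rounding error.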
[0081] 

The information can also be conveyed by any of various
syntaxes capable of conveying information applied in frame
units, other than the NAL unit syntax. For example, in H.26L,
a syntax may be added into the Supplemental Enhancement
Information Message syntax.
[0082] 

As another means, it is also possible to extend the
memory management control operation (MMCO) commands defined
for control of reference frames in H.26L. In this case, since
the Reset command defined as an MMCO command defines the
same operation as the IDR picture in the foregoing NAL unit
type, it is also possible, for example, to define an output
queueing frame reset (Reordering Buffer Reset) command, as
in the case of the definition in the NAL unit type. The
definition of operation by this command may be determined
to be similar to the definition of the operation with the
IBR picture in the NAL unit type.
[0083] 

When another video coding method is applied, it is
possible to use a syntax for conveying information applied
to frame units in that coding method. It may also be announced
outside encoded data in the video coding method, as in the
case of ITU-T Recommendation H.245 utilized for announcement
of control information in communication using H.263.

[0084] 

Lastly, a video encoding program and a video decoding
program according to the present invention will be described
with reference to Fig. 9.

As shown in Fig. 9, video processing program 310 is
stored in program storage area 30a formed in recording medium
30. The video processing program 310 can be executed by the
video processing system 100 shown in Fig. 8 and has main
module 311 responsible for video processing, video encoding
program 312 described below, and video decoding program 313
described below.
[0085] 

The video encoding program 312 is provided with chosen
information output module 312a. The function
implemented by operation of the chosen information output
module 312a is much the same as the function of the chosen
information output part 11 of the video encoding device 10.
[0086] 

The video decoding program 313 is provided with decoded
image use eliminating module 313a. The function
implemented by operation of the decoded image use
eliminating module 313a is much the same as the function
of the decoded image use eliminating part 21 of the video
decoding device 20.
[0087] 

The video processing program 310 may be configured so
that part or all thereof is received through a transmission
medium such as a communication line by another device to
be recorded. Conversely, the video processing program 310
may also be configured to be transmitted through a
communication medium to be installed in another device.

[0088] 

From the invention thus described, it will be obvious
that the embodiments of the invention may be varied in many
ways. Such variations are not to be regarded as a departure
from the spirit and scope of the invention, and all such
modifications as would be obvious to one skilled in the art
are intended for inclusion within the scope of the following
claims.